step size
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > New York > Rensselaer County > Troy (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.67)
6d0bf1265ea9635fb4f9d56f16d7efb2-Supplemental-Conference.pdf
Supplementary Materials for "Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models"
- Appendix A The Algorithm
- Appendix B Convergence Rates
- Appendix B.1 Rate of Convergence for Strongly Convex Functions
- Appendix B.2 Rate of Convergence for Convex Functions
- Appendix B.3 Rate of Convergence for Functions Satisfying the PL Condition
- Appendix B.4 Common Lemmas
- Appendix B.5 The Polyak Step Size is Bounded
- Appendix C Experimental details
- Appendix D Plots Completing the Figures in the Main Paper
- Appendix D.1 Comparison between PoNoS and the state-of-the-art
- Appendix D.2 A New Resetting Technique
- Appendix D.3 Time Comparison
- Appendix D.4 Experiments on Convex Losses
- Appendix D.5 Experiments on Transformers
- Appendix E Additional Plots
- Appendix E.1 Study on the Choice of c: Theory (0.5) vs Practice (0.1)
- Appendix E.2 Study on the Line Search Choice: Various Nonmonotone Adaptations
- Appendix E.3 Zoom in on the Amount of Backtracks
- Appendix E.4 Study on the Choice of η
In this section, we give the details of our proposed algorithm PoNoS. Training machine learning models (e.g., neural networks) entails solving the following finite sum problem: min [...]
Before that, we establish the following auxiliary result. The following Lemma shows the importance of the interpolation property.
Lemma 4. We assume interpolation and that f [...]
Let us now analyze case 2). Let us now show that b < 1.
B.2 Rate of Convergence for Convex Functions
In this subsection, we prove a O( [...]
The above bound will be now proven also for case 2).
6d0bf1265ea9635fb4f9d56f16d7efb2-Paper-Conference.pdf
Recent works have shown that line search methods can speed up Stochastic Gradient Descent (SGD) and Adam in modern over-parameterized settings. However, existing line searches may take steps that are smaller than necessary since they require a monotone decrease of the (mini-)batch objective function.
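The difference between a monotone and a relaxed (nonmonotone) backtracking condition can be illustrated with a minimal sketch. This is not the paper's PoNoS algorithm: it assumes a generic Armijo-style test in which the reference value is either the current loss (monotone) or the maximum of the last few losses (one common nonmonotone choice). The names `armijo_backtrack`, `run`, `window`, `c`, and `delta` are hypothetical and chosen only for illustration; `f` stands in for a mini-batch objective.

```python
from collections import deque


def armijo_backtrack(f, x, g, reference, eta0=1.0, c=0.5, delta=0.5, max_backtracks=30):
    """Shrink eta until f(x - eta * g) <= reference - c * eta * ||g||^2.

    `reference` is an assumption of this sketch: pass f(x) for the monotone
    test, or a larger value (e.g. a max over recent losses) to relax it.
    """
    g_sq = sum(gi * gi for gi in g)
    eta = eta0
    for _ in range(max_backtracks):
        trial = [xi - eta * gi for xi, gi in zip(x, g)]
        if f(trial) <= reference - c * eta * g_sq:
            break
        eta *= delta  # backtrack: try a smaller step
    return eta


def run(f, grad, x0, steps=20, window=1, eta0=1.0):
    """Gradient descent with backtracking.

    window=1 recovers the monotone condition; window>1 compares against
    the maximum of the last `window` recorded losses.
    """
    x = list(x0)
    history = deque([f(x)], maxlen=window)
    etas = []
    for _ in range(steps):
        g = grad(x)
        eta = armijo_backtrack(f, x, g, max(history), eta0=eta0)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        history.append(f(x))
        etas.append(eta)
    return x, etas


# Toy objective standing in for a mini-batch loss.
def f(x):
    return x[0] ** 2


def grad(x):
    return [2.0 * x[0]]


eta_monotone = armijo_backtrack(f, [1.0], [2.0], reference=f([1.0]))
eta_relaxed = armijo_backtrack(f, [1.0], [2.0], reference=4.0)
```

Because the acceptance test only gets easier as the reference value grows, the relaxed variant never accepts a smaller step than the monotone one from the same point and initial step size, which is the effect the abstract describes.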
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- North America > Canada > British Columbia (0.04)
- Europe > Russia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
- North America > United States (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)